Automated Assay for Identification of Individual Cells During Kinetic Assays

Cross Reference

This application claims priority to U.S. Provisional Patent Application
Serial No. 60/258,147, filed December 22, 2000.

Field of the Invention 

The application relates to kinetic cell-based screening.

Background of the Invention 

Kinetic assays are performed by making measurements at a series of points in
time to measure the change of a sample. The measurements at any one time point
might also be used for a non-kinetic assay, here called a fixed endpoint assay. Fixed
endpoint assays are sufficient for samples that exhibit little or no change over the
duration of the assay. If the sample changes over time, kinetic measurements are
required to measure those changes. Mathematical descriptions of the trends in various
cell parameters over time represent kinetic features that are distinct from the
measurements calculated in fixed endpoint assays.

Kinetic assays are performed on the same sample over time and are distinct
from common experiments that provide an approximation of kinetic features with
fixed endpoint assays on different portions of a sample. For example, if the sample is
a population of cells comprising a number of similar individual cells, changes in the
population over time can be measured by assaying portions of the sample with a series
of fixed endpoint assays. This approach is commonly used in biochemical or
immunohistochemical assays when samples are killed (i.e., fixed) or destroyed during
the assay. A series of fixed endpoint assays makes measurements on individual cells,
but the particular individuals within each population are different at each fixed
endpoint assay and cannot be related to each other on the cell level. A series of fixed
endpoint assays provides useful kinetic information only when the population average
measurements are assumed to be related from portion to portion of the sample and the
individual cells in the population are assumed to be equivalent.






WO 02/061423 PCT/US01/49928

The fixed endpoint approach is insufficient if the cells in the sample are not
equivalent or if the changes must be related over time on a cell-by-cell basis.
Measurements of physiologically relevant cells are heterogeneous, reflecting the
normal variability of cell behavior in an intact animal. The heterogeneity often
includes important information on the physiology of cells in the living state, and
biologically relevant measurements must include, not exclude, the variability of the
sample. Living cells that change independently of each other must be measured at
multiple times and the measurements correlated over time on a cell-by-cell basis.

A true kinetic assay addresses these problems by providing measurements on single
cells correlated through time. Generally, cells are identified by position and by other
characteristics to provide continuity of cell-level biological measurement at each time.
A typical problem to be overcome is positional uncertainty of cells due to movement
of cells or the measuring instrument. The ability to identify cells over time allows the
user to measure and account for sample variability and subpopulation behavior. The
whole population response of a sample is often due to the activity of just a
subpopulation of cells. Accurate kinetic measurement of subpopulations provides
higher content information about the physiological or pharmacological response of a
biological sample. Cell-based kinetic measurements also allow multiple
measurements of the same sample (multiparametric assays) to be correlated on the
cell level, connecting measurements of different cellular functions and mechanisms,
and thus providing a better mechanistic understanding of cells and drugs that affect
them.

Therefore, methods for tracking individual cells during a kinetic cell screening 
assay are needed in the art. 


Summary of the Invention 

The present invention provides methods and software for tracking individual 
cells during a kinetic cell screening assay, comprising: 

a) providing cells that possess at least a first luminescently labeled
reporter molecule that reports on a cell structure;

b) obtaining a structure image from luminescent signals from the at least
first luminescently labeled reporter molecule in the cells in a field of view;

c) creating a structure mask for individual cells in the field of view;

d) defining a reference point of each structure mask;

e) assigning a cell identification to each reference point in the field of
view;

f) repeating steps (b) through (e) at a second time point;

g) correlating cell identification between the first time point and the
second time point by calculating a distance between reference points in the field of
view at the first time point and reference points in the field of view at the second time
point; and

h) defining a cell identification match by identifying reference points in
the field of view at the first time point and reference points in the field of view at the
second time point that are closest together.

In a preferred embodiment, steps (f) - (h) are repeated for a desired number of
time points, wherein determining the distance between reference points is done by
determining a distance between reference points in successive time points, and
wherein defining the closest cell identification match is done by defining the closest
cell identification match in successive time points.

Another preferred embodiment includes assigning a quality score to the cell
identification match based on a distance determined for a second closest cell
identification match, wherein a cell identification match is rejected if the quality score
is below a user-defined threshold for a quality score.

A further preferred embodiment comprises comparing other features of the
individual cells between successive time points in order to facilitate cell identification.

Brief Description of the Figures 

Figure 1 is a flow chart showing one embodiment of the method for tracking
individual cells during a kinetic cell screening assay.

Detailed Description of the Preferred Embodiments

In kinetic assays, cells may move around, enter or leave the field, grow,
shrink, or divide; also, separate cells may move into or out of contact with each other.
In determining features for individual cells over time, it is preferable to optimize
correct identification of individual cells from timepoint to timepoint. Thus, after
collecting the data for a current timepoint, a second cell identification is reconciled
against a cell identification obtained from the first timepoint in the kinetic scan for the
well. This will ensure that the kinetic data is associated with the correct cell
throughout all timepoints of the kinetic scan. After obtaining the cell, field, well, and
plate level data for the current timepoint, the kinetic data is integrated with any
previous kinetic data to form the kinetic features for individual cells, from which
field-based, well-based, and/or plate-based kinetic features pertaining to any desired
cell screening assay can be derived.

Methods for reconciling cell identification across different time points help
ensure that any given cell has the same identification from image to image in the
image series.

The present invention provides methods and software for tracking individual
cells during a kinetic cell screening assay, comprising:

a) providing cells that possess at least a first luminescently labeled
reporter molecule that reports on a cell structure;

b) obtaining a structure image from luminescent signals from the at least
first luminescently labeled reporter molecule in the cells in a field of view;

c) creating a structure mask for individual cells in the field of view;

d) defining a reference point of each structure mask;

e) assigning a cell identification to each reference point in the field of
view;

f) repeating steps (b) through (e) at a second time point;

g) correlating cell identification between the first time point and the
second time point by calculating a distance between reference points in the field of
view at the first time point and reference points in the field of view at the second time
point; and

h) defining a cell identification match by identifying reference points in
the field of view at the first time point and reference points in the field of view at the
second time point that are closest together.

As used herein, the term "image" means a digital representation of the
optically detectable signals from the at least first optically detectable reporter
molecule, and does not require a specific arrangement or display of the digital
representation. Images are parcels of information derived from the sample that are
organized in various ways for the convenience of the observer. In preferred
embodiments, well known formats for such "images" are employed, including but not
limited to DIB, TIFF, BMP, picture element (pixel) maps, three-dimensional volume
arrays, two-dimensional surface or cross section arrays, or one-dimensional line scan
images, oscilloscope time traces, orthogonal arrays of integers, pixel intensity
numbers, hexagonal grids of integers, floating point pixels, and planar, chunky or
Bayer pattern arrays of multispectral pixel arrays. In a most preferred embodiment,
picture element (pixel) map images are used, such as those produced by optical
cameras where spatial location in one plane (X, Y) within the sample is represented
by spatial location within the map (x, y) and luminescent sample intensity (I) is
represented by the signal amplitude or value (i) at each pixel.

The Field Of View (FOV) is the area that is imaged. It is equivalent to the
image size. The dimension of the FOV can either be expressed in microns at the scale
of the sample area, or in pixels of the image size. The cell sample area is generally
much larger than the FOV, such as for a medium or high resolution image of a 96- or
384-well plate.

As used herein, an "optically detectable reporter molecule" is a reporter
molecule that can emit, reflect, or absorb light, and includes, but is not limited to,
fluorescent, luminescent, and chemiluminescent reporter molecules. In a preferred
embodiment, a fluorescent reporter molecule is used.

The cell structure reported on by the optically detectable reporter molecule can
be any detectable cell structure, including nuclei, intracellular organelles, cytosol
markers, and plasma membrane markers. In the simplest case, the cell structure is
present as a single entity in the cell, such as the nucleus.

As used herein, the reporter molecule "reports on" the cell structure by
processes including, but not limited to, binding to the cell structure, either directly or
indirectly, and by being incorporated into or contained within the cell structure.

As used herein, the "reference point" is a single point defined relative to the
cell structure, including but not limited to a center of the cell structure, a center of
mass of the cell structure, a centroid (defined as a geometric center) of the cell
structure, or by drawing a bounding box around the cell structure, wherein the point
can be defined, for example, as the intersection of any two diagonals within the
bounding box. In a preferred embodiment, a centroid of the cell structure is used.

Images are acquired of the at least first optically detectable reporter molecule,
and the images can optionally be preprocessed (shade corrected and smoothed). The
images are then thresholded (preferably using an automatic thresholding procedure),
producing a structure mask. In a further preferred embodiment, the cell structure is a
nucleus, wherein the structure image is a nuclear image, and wherein the structure
mask is a nuclear mask. As used herein, the term "mask" means a version of the cell
structure image that has been processed to fill holes. Creation of a mask preferably
comprises thresholding the image to select relevant image components with values
(position, intensity) above background outside of the structures of interest.
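
By way of illustration only, thresholding an image into per-structure masks and taking the centroid of each mask as its reference point might be sketched as follows. The function names, the fixed threshold, the 4-connected flood fill, and the toy image are assumptions made for this sketch; shade correction, smoothing, automatic thresholding, and hole filling are omitted.

```python
from collections import deque

def structure_masks(image, threshold):
    """Threshold a 2-D intensity image; return one pixel list per structure."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    masks = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] <= threshold or seen[r][c]:
                continue
            # Flood-fill one 4-connected region of above-threshold pixels.
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            masks.append(pixels)
    return masks

def centroid(pixels):
    """Geometric center (x, y) of a mask, used here as the reference point."""
    n = len(pixels)
    return (sum(x for _, x in pixels) / n, sum(y for y, _ in pixels) / n)

# Toy 4 x 6 image containing two bright structures (hypothetical data).
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 0],
    [0, 9, 9, 0, 8, 0],
    [0, 0, 0, 0, 8, 0],
]
masks = structure_masks(image, threshold=5)
reference_points = [centroid(m) for m in masks]
```

Each entry of `reference_points` would then receive a cell identification at the first time point.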

As used herein, the following terms are defined as follows:

A cell that is entirely within the FOV is termed an "FOV cell". These are the
cells that can be analyzed.

A cell that is entirely outside the FOV is a "Non-FOV cell". These cells are
not analyzed.

A Boundary Cell is defined as a cell touching the FOV boundary. Most feature
measurements of these cells would be incomplete or inaccurate, and thus Boundary
Cells are preferably discounted. However, a Boundary Cell can be considered an
intermediate state that can be tracked if desired.

A Departure is defined as a cell leaving, in its entirety, the FOV from any
direction. The cell needs to be completely outside the FOV to be called a Departure.

Cells in motion may arrive and depart from the FOV at any time. An Arrival is
defined as a cell entering, in its entirety, the FOV from any direction. The cell needs
to be completely inside the FOV to be called an Arrival, because until then, it would
be an incomplete boundary cell that is generally not analyzed.

Arrivals and Departures add to the complexity of tracking because they
require a more complex administration of which cells exist throughout the extent of
the entire movie. If all cells exist at all time points, this administration would be a
simple array that can be established from analysis of the first time point. If cells are
not present at certain time points, it requires an analysis of the full image series to
build up this inventory and more elaborate data management.

A Create event is defined as a cell appearing "out of the blue" anywhere in the
FOV but not on the edge. For example, a cell may not have had enough labeling
intensity to be detected at first, but during the course of the image series it responded
and became visible. If a cell appears on the edge, it would be an arrival. A Destroy
event is defined as a cell disappearing from the FOV but not moving out as a
departure. For example, a cell may die and somehow lose its labeling marker.
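
By way of illustration only, the cell states and events defined above might be administered as sketched below. The bounding-box representation, the status strings, and the rule that a vanished Boundary Cell counts as a Departure while a vanished FOV cell counts as a Destroy event are assumptions of this sketch, and they presume sampling often enough that a cell is seen on the boundary before it leaves.

```python
def fov_status(bbox, fov_width, fov_height):
    """Classify a detected cell from its pixel bounding box.

    bbox is (x_min, y_min, x_max, y_max) in pixel coordinates, with valid
    pixels running from 0 to fov_width - 1 and 0 to fov_height - 1.
    """
    x_min, y_min, x_max, y_max = bbox
    touches_edge = (x_min <= 0 or y_min <= 0
                    or x_max >= fov_width - 1 or y_max >= fov_height - 1)
    return "boundary" if touches_edge else "fov"

def classify_event(previous, current):
    """Name the event between two time points for one cell.

    previous and current are "fov", "boundary", or None (not detected).
    Returns "arrival", "create", "departure", "destroy", or None.
    """
    if current == "fov" and previous == "boundary":
        return "arrival"      # completed its entry across the FOV edge
    if current == "fov" and previous is None:
        return "create"       # appeared away from the edge
    if current is None and previous == "boundary":
        return "departure"    # last seen on the edge, now fully outside
    if current is None and previous == "fov":
        return "destroy"      # vanished without passing the edge
    return None               # no event (e.g. the cell stayed in the FOV)

status_edge = fov_status((0, 5, 10, 12), fov_width=100, fov_height=100)
status_inside = fov_status((5, 5, 10, 12), fov_width=100, fov_height=100)
```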

There are three general embodiments of the methods and software for
tracking individual cells during a kinetic cell screening assay of the invention. Each
method provides alternatives to the basic method, each with added sophistication for
rejecting fewer cells and providing increased robustness.

1. Simple Proximity Method: In one embodiment, determining a cell
identification match comprises identifying reference points in the field of view at a
first time point and reference points in the field of view at a second time point that are
closest together, and assigning the appropriate cell identification to the cell at the later
time point in the image series. This should be successful if most or all of the cells are
slow moving (considering the frame rate). Any other data needed to do the
comparison at subsequent time points in the image series is stored, including but not
limited to the reference points of all the cells from the immediately preceding time
point, and the cell IDs which were assigned to each of those cells (ID Mapping
Table). In a preferred embodiment, the cell identification match is rejected if it falls
below a user-defined threshold for a cell identification match. For example, the user
can determine a maximum reasonable distance that cells can move between time points
(i.e., a maximum rate of motion), or thresholding can be used to select relevant image
components with values (position, intensity) above background outside of the
structures of interest.

The successive sets of reference points are preferably matched up as follows.
For each cell in the current set, its distance to each reference point in the immediately
preceding timepoint is determined. The two closest preceding cells are determined.
The closest previous cell is assigned to the current cell, and a quality score (between 0
and 100) is assigned to the match, which increases as the relative distance of the
second best match increases. In a preferred embodiment, the quality score is
calculated according to the formula:

Quality = 100 * (SecondBestDistance - BestDistance) / SecondBestDistance 

This is preferably used when the distances that the cells moved are small
enough so that there is no confusion as to which cell moved where. A quality of
match is computed to estimate this. The quality of match is 100% if there is no
confusion, and 0% if there is an equal chance that the cell could have been a
neighboring cell. In a further preferred embodiment, a cell identification match is
rejected if the quality score is below a user-defined threshold for a quality score. The
threshold can be defined in various ways, such as those described above, or, given a
specific experimental situation for the cells, the user can predict the likelihood of cells
being created or destroyed and the acceptable quality score of matching can be set
accordingly. For example, the threshold for an acceptable quality of match score can
be set lower if the user is not expecting Create and Destroy events. Cells/artifacts
are removed from analysis if they do not map uniquely to a cell ID.
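
By way of illustration only, the Simple Proximity Method with its quality score might be sketched as follows. The function name, the dictionary used as the ID Mapping Table, the default thresholds, and the handling of a single candidate or a zero second-best distance are assumptions of this sketch.

```python
import math

def match_by_proximity(previous, current, min_quality=50.0, max_distance=None):
    """Match current reference points to cell IDs of the preceding time point.

    previous: {cell_id: (x, y)}, the ID Mapping Table of the preceding point.
    current:  list of (x, y) reference points at the current time point.
    Returns a list of (cell_id or None, quality), aligned with `current`.
    """
    results = []
    for point in current:
        ranked = sorted((math.dist(point, ref), cell_id)
                        for cell_id, ref in previous.items())
        if not ranked:
            results.append((None, 0.0))
            continue
        best_distance, best_id = ranked[0]
        if len(ranked) == 1:
            quality = 100.0               # assumption: no competing neighbor
        elif ranked[1][0] == 0:
            quality = 0.0                 # assumption: two candidates coincide
        else:
            second = ranked[1][0]
            # Quality = 100 * (SecondBestDistance - BestDistance) / SecondBestDistance
            quality = 100.0 * (second - best_distance) / second
        too_far = max_distance is not None and best_distance > max_distance
        if quality < min_quality or too_far:
            best_id = None                # reject an ambiguous or implausible match
        results.append((best_id, quality))
    # Remove cells that do not map uniquely to a cell ID.
    claimed = [cell_id for cell_id, _ in results if cell_id is not None]
    return [(cell_id if claimed.count(cell_id) == 1 else None, quality)
            for cell_id, quality in results]

previous = {1: (10.0, 10.0), 2: (50.0, 50.0)}
current = [(12.0, 10.0), (30.0, 30.0)]    # the second point is equidistant
results = match_by_proximity(previous, current)
```

In this toy input the first point is matched to cell 1 with a quality near 96, while the equidistant second point scores 0 and is rejected.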

2. Total Distance Minimization: If the Simple Proximity Method results in
ambiguous matches (e.g., low quality scores due to two cells being equidistant), a global
matching may be performed as well. Thus, in a further embodiment, the method
further comprises determining a total sum of all distances or distances squared for all
possible cell identification matches in successive time points, wherein a smallest total
sum of all distances or distances squared is defined as a closest set of cell
identification matches.

A matrix of distances between each current and each previous cell is
computed. Every possible permutation of the cells (there are N! permutations) is
scored by summing the distances (or the squares of distances) for all its pairs, with the
lowest total being the best cell identification match. In a further preferred
embodiment, the amount of computation can be reduced by pre-pairing (using the
Simple Proximity Method) any matches with quality scores over a preset threshold,
and then excluding the cells in those pre-pairings from the global matching process.
This last adjustment will work very well if the cells vary widely in movement rate.
Alternatively, the method can reduce the amount of computation by excluding those
cells in pre-pairing that fall below a user-defined threshold from the global matching
process.
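
By way of illustration only, scoring every permutation of a small set of cells might be sketched as follows. The function name and the requirement of equal numbers of previous and current cells are assumptions of this sketch; because the search visits N! permutations, it is practical only for small N, such as the cells left over after pre-pairing.

```python
import itertools
import math

def minimize_total_distance(previous, current, squared=False):
    """Exhaustively find the pairing with the smallest total distance.

    previous, current: equal-length lists of (x, y) reference points.
    Returns (assignment, total), where assignment[i] is the index of the
    previous cell paired with current cell i.
    """
    if len(previous) != len(current):
        raise ValueError("this sketch assumes equal numbers of cells")

    def cost(a, b):
        d = math.dist(a, b)
        return d * d if squared else d

    # Matrix of distances between each current and each previous cell.
    matrix = [[cost(c, p) for p in previous] for c in current]
    best_total, best_perm = math.inf, None
    for perm in itertools.permutations(range(len(previous))):
        total = sum(matrix[i][j] for i, j in enumerate(perm))
        if total < best_total:
            best_total, best_perm = total, perm
    return list(best_perm), best_total

previous = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
current = [(11.0, 0.0), (1.0, 0.0), (21.0, 0.0)]   # each cell moved by 1
assignment, total = minimize_total_distance(previous, current)
```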

In a further preferred embodiment, the methods further comprise assigning a
quality score to the cell identification match based on a sum of distances or distances
squared determined for a second closest cell identification match, and wherein a cell
identification match is rejected if the quality score is below a user-defined threshold
for a quality score.

3. Total Distance & Feature Matching Minimization: In a further preferred
embodiment, defining the cell identification match further comprises comparing other
characteristic features of the individual cells between successive time points in order
to identify cells, and comparing the measurements for any proposed match. Applied
to individual cells, this is a way of efficiently resolving individual ambiguities. As
part of a more elaborate method (e.g., as a follow up to the Total Distance
Minimization Method), the feature sets constitute a matrix of better data, which is
compared with the vectors for the previous timepoint, minimizing the weighted
sum of differences (or differences squared) as the measure of matching. The matrix
would now be better called a confusion matrix, where each position is a compounded
number containing the distance + any other cell feature matching values.
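
By way of illustration only, a confusion matrix whose entries compound the distance with weighted feature differences might be built as sketched below. The dictionary layout of a cell, the `distance_weight` parameter, the use of absolute differences, and the example feature "area" are assumptions of this sketch.

```python
import math

def confusion_matrix(previous, current, weights, distance_weight=1.0):
    """Build a matrix of compounded match costs.

    Each cell is a dict with "xy" (reference point) and "features"
    ({feature_name: value}). Entry [i][j] combines the distance between
    current cell i and previous cell j with the weighted absolute
    differences of their features; smaller means a better match.
    """
    matrix = []
    for cur in current:
        row = []
        for prev in previous:
            score = distance_weight * math.dist(cur["xy"], prev["xy"])
            for name, weight in weights.items():
                score += weight * abs(cur["features"][name] - prev["features"][name])
            row.append(score)
        matrix.append(row)
    return matrix

# Two previous cells are equidistant from the current cell; area breaks the tie.
previous = [
    {"xy": (0.0, 0.0), "features": {"area": 100.0}},
    {"xy": (20.0, 0.0), "features": {"area": 200.0}},
]
current = [{"xy": (10.0, 0.0), "features": {"area": 195.0}}]
matrix = confusion_matrix(previous, current, weights={"area": 0.1})
best_previous = min(range(len(previous)), key=lambda j: matrix[0][j])
```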

Quality of Cell Match 

In contrast to the measurement of a quality score based on the simple
proximity method, a quality score for the Total Distance & Feature Matching
Minimization method is further based on quality of the match based on one or more of
any number of cell features, including but not limited to a) actual available features of
the cell or subcellular structures, such as fluorescent intensity, cell area, cell shape,
etc.; and/or b) additionally created features of the cell such as exogenous tags (i.e.,
tags associated with the cells solely for the purpose of cell tracking), such as
"bar-coding tags" (discussed below). The algorithm is designed to work with any set and
any number of features, which may change for different assays, cell types, etc.

While the analysis of cell features in determining a quality score can be
incorporated into the simple proximity method (e.g., carried out for possible cell
identification matches being analyzed), it is preferred to "pre-pair" cell identification
matches via the simple proximity method, and carry out feature analysis only when
necessary on those cell identification matches that are ambiguous using the simple
proximity method.

Since the cell can change shape and other cell features over time, the quality
score is never absolutely perfect. Conversely, different cells may possess similar cell
features, and thus can yield a relatively high quality score. Each cell
feature may have a different value of contribution to the matching problem. Cell
features that have more variation between cells, such as a unique identifier (nuclear
texture, intensity, or position with respect to other cell structures such as the
perinuclear Golgi apparatus), are preferably accorded more weight than those that
show less variation between cells, such as the position of the whole cell reference
point with respect to a nuclear reference point. This preferred embodiment
comprises according a weight factor to each cell feature for calculation of the quality
score. In one embodiment, a user provides those weight factors. In another
embodiment, the weight factor is computed from learning sets by applying a Bayes
classifier or other technique.

In a preferred embodiment, the quality score is determined by first calculating
its reciprocal, i.e., the difference between cells. This "MisMatch" (preferably
weighted) is the sum of the differences between cell features. In a preferred
embodiment, the MisMatch between a cell 1 and a cell 2 is expressed as follows:

\[ \mathrm{MisMatch} = \sum_{a} W_a \cdot \mathrm{DIFF}(F_{a1}, F_{a2}) \]

Where:

"a" is each cell feature being used

$W_a$ is the weight factor for feature a

$F_{a1}$ is the feature a computed for cell 1

$\mathrm{DIFF}(F_{a1}, F_{a2})$ is the difference function between cell feature a
computed for cell 1 and cell feature a computed for cell 2.

The DIFF() function can be defined, for example, as:

DIFF(x,y) = ABS(x-y) (wherein "ABS" means the absolute value); or

DIFF(x,y) = (x-y)·(x-y)

The square of the difference helps in making the function "steeper".

For example, one or more of the following cell features can be assessed:
a) Cell size
b) Average cell fluorescent intensity
c) Cell P2A or shape factor
For these features, the weight factors (Wa, Wb and Wc, respectively) are
preferably set to 1.0. The weight of each cell feature can be reduced by
using a weighting factor that is a fraction between 0 and 1, while the weight of each
cell feature can be increased by using a weighting factor greater than 1. The array of
weight factors is given as an input to the algorithm, so it can be easily adapted as
needed.

The quality score is simply the reciprocal of the MisMatch:
quality score = 1 / MisMatch
For a "perfect match" the MisMatch is zero, and hence the quality score is
infinitely good.
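
By way of illustration only, the MisMatch and its reciprocal quality score might be computed as sketched below. The function names, the feature names, and the example values are assumptions of this sketch; a zero MisMatch is reported as an infinite quality score.

```python
import math

def diff_abs(x, y):
    """DIFF(x, y) = ABS(x - y)."""
    return abs(x - y)

def diff_squared(x, y):
    """DIFF(x, y) = (x - y) * (x - y), a "steeper" difference."""
    return (x - y) * (x - y)

def mismatch(cell_1, cell_2, weights, diff=diff_abs):
    """Weighted sum over features a of W_a * DIFF(F_a1, F_a2)."""
    return sum(w * diff(cell_1[a], cell_2[a]) for a, w in weights.items())

def quality_score(cell_1, cell_2, weights, diff=diff_abs):
    """Reciprocal of the MisMatch; infinite for a perfect match."""
    m = mismatch(cell_1, cell_2, weights, diff)
    return math.inf if m == 0 else 1.0 / m

# Weight factors of 1.0 for size, average intensity, and shape factor.
weights = {"size": 1.0, "intensity": 1.0, "p2a": 1.0}
cell_1 = {"size": 120.0, "intensity": 40.0, "p2a": 1.5}
cell_2 = {"size": 118.0, "intensity": 41.0, "p2a": 1.0}

m_abs = mismatch(cell_1, cell_2, weights)                 # 2 + 1 + 0.5
m_sq = mismatch(cell_1, cell_2, weights, diff_squared)    # 4 + 1 + 0.25
q = quality_score(cell_1, cell_2, weights)
```

Because the weight array is passed in as an argument, it can be changed per assay without touching the functions.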

Possible Limitations on the Quality Score 

In some instances, the cells may be too "plain" to extract distinctive cell
features from them. For example, they may all look like spheres without texture. One
way to alleviate this problem is to examine as many unique cell features as possible.
For example, multiple fluorescence channels can be analyzed to generate more cell
features, for example by labeling multiple structures such as nuclei and Golgi
apparatus. Generally, the desirable characteristics of cell features for identifying cells
include distinction from neighboring cells and constancy over time.

In a preferred embodiment, a "Bar Coding" scheme is implemented to get
even more distinct features added to the cells. Generally, the desirable characteristics
of barcoding particles for identifying cells include distinction from each other, from
cell to cell, and constancy over time. Particles for "bar-coding" cells are available in
mixtures of varying intensity, color and size (fluorescent beads of different size and
intensity from Bangs Labs, or sets of multispectral Quantum Dots contained within
beads, for example), so that most cells can be associated with a particle or set of
particles possessing unique features, which can thus be counted as unique
cell features. Bar code particles can be contained within cells by random distribution
to cells and natural phagocytosis of the particles. Alternate methods can be employed
to increase the yield of labeled particles and the uniformity of labeling, including
physical projection or injection of particles, or depositing cells onto ordered arrays
of barcode particles deposited on substrates to control the number and distribution of
particles delivered to cells. Barcode particles need not be associated on a perfect
one-to-one basis with cells to provide value for cell identification. The methods described
here are fault tolerant, and imperfect bar coding contributes to cell identification even
if barcodes are not contained within every cell or if barcodes are repeated
occasionally within the image. Barcode particles can be observationally associated
with cells by, for example, their proximity to a labeled nucleus or other cell structure or
by being contained within the cell periphery. In these instances, the "bar code"
features are treated just like any other cell feature in the quality score equation above.
In favorable instances, distribution, uniqueness and universality of the bar coding
particles is sufficient and no supportive biological structures are required to associate
particles with unique cells. If the bar coding technique is very high quality, with a
majority of cells containing a unique barcode, the weight factor for the bar code cell
feature can be very high, completely supplanting the need to label endogenous cell
structures. A less stringent bar coding scheme is given a lower weight factor and
simply contributes a part in the process of matching.

In other instances, the cells may change their shape and cell features so much
when imaged from timepoint to timepoint that their identification at different time
points as the same cell is difficult. This problem is alleviated by sampling often
enough in time (increased Sampling Frequency) to ensure the variability over time is
less than the variability between cells.

The Sampling Frequency means the number of image acquisitions per minute.
An insufficient sampling frequency reduces the ability to effectively track cells or
measure fast cellular events. An excessive sampling frequency may damage the cells
due to phototoxicity. An optimal sampling frequency will thus vary depending on
various factors, including cell motion, cell density, the cellular event being analyzed,
and the probabilities of arrivals (move into FOV), departures (leave FOV), and
collisions with other cells. For example, high speed calcium changes may require a
faster sampling frequency to satisfy the tracking needs than most assays. In general,
an optimal sampling frequency is the minimum frequency needed to be able to
reconstruct a signal with arbitrary precision (i.e., the Nyquist sampling frequency). One
way to find this frequency is to look at the Fourier spectrum of the original signal and
find the highest frequency component. The Nyquist sampling frequency is twice that
frequency. Sampling below the Nyquist frequency may not allow reconstruction of
the higher frequency components of the signal and may produce aliasing artifacts.
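
By way of illustration only, the highest frequency component of a signal might be estimated from its Fourier spectrum and doubled as sketched below. The sketch assumes that an oversampled pilot recording of the signal is available; the function name, the relative threshold for a "significant" component, and the synthetic test signal are likewise assumptions.

```python
import cmath
import math

def highest_frequency(samples, rate, rel_threshold=0.05):
    """Highest significant frequency in a uniformly sampled signal.

    samples: measurements from an oversampled pilot recording.
    rate: acquisitions per minute used for that recording.
    A component counts as significant when its spectral magnitude is at
    least rel_threshold times the largest magnitude (an assumption).
    Returns the frequency in cycles per minute.
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    # Naive discrete Fourier transform over the positive frequency bins.
    magnitudes = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        magnitudes.append(abs(coeff))
    peak = max(magnitudes)
    if peak == 0:
        return 0.0
    top_bin = max(k for k, m in enumerate(magnitudes, start=1)
                  if m >= rel_threshold * peak)
    return top_bin * rate / n

# Synthetic pilot signal: components at 3 and 7 cycles per minute,
# recorded for one minute at 64 acquisitions per minute.
rate = 64
samples = [math.sin(2 * math.pi * 3 * i / rate)
           + 0.5 * math.sin(2 * math.pi * 7 * i / rate)
           for i in range(rate)]
f_max = highest_frequency(samples, rate)
nyquist_rate = 2 * f_max   # minimum acquisitions per minute
```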

It is also desirable to optimize the "Yield" of the kinetic assay. The yield can
be expressed as an absolute number of cells that maintain a Free Path (i.e., no
collisions), or as the percentage of those cells compared to the total cells. The
probability of a Free Path is the likelihood of a cell not being involved in any
collisions, and not leaving or entering the FOV during the entire image acquisition.
This probability will go down the longer kinetic data is acquired, since sufficient cell
motion can eventually cause all cells to collide or move from the FOV, and is
dependent on the cell motion and the cell density.

Given a particular cell density (e.g., number of cells per unit area), a user
can compute the average distance between cells. If a FOV has an extremely high cell
density, it will have the potential for a high yield, but probably all cells will be
colliding within the first few image acquisitions, reducing the yield to zero. Hence, it
is useful to determine an optimal cell density to produce an optimal yield for a given
set of cell behavior parameters. The optimal cell density will vary based on all of the
various factors discussed herein, and thus is preferably determined experimentally or
by computer simulation. For example, the optimal experimental cell density will
depend on the biological function of cells to be measured and on the statistical error
desired for measurement of the sample.

An optimal cell density, accounting for biological variables, is between 10
and 50% of confluency.

The average distance between cells may need to be corrected for cell
confluency (e.g., percentage of cells that are touching other cells) or cell clustering.
Given an average cell motion speed estimate, we can set the maximum sampling
frequency allowed to satisfy the "Nyquist" criterion. The cell motion speed is
preferably expressed as an average distance traveled per time point of the image
acquisition. Cell motion can also be described by defining its speed and persistence
in a direction (Directed motion), by a diffusion coefficient (Brownian motion), and/or
by defining an affinity factor, which reflects the effect of nearby cells on the motion
of a cell.
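
By way of illustration only, the average distance between cells and a corresponding limit on the time between acquisitions might be estimated as sketched below. The sketch assumes randomly (Poisson) placed, non-clustered cells, for which the expected nearest-neighbor distance is 1 / (2 * sqrt(density)); the function names, the rule that a cell should move no more than a chosen fraction of that distance between acquisitions, and the example values are likewise assumptions.

```python
import math

def mean_nearest_neighbor_distance(density):
    """Expected nearest-neighbor distance for randomly placed cells.

    density: cells per unit area (e.g. cells per square micron).
    Assumes a uniform random (Poisson) layout with no clustering.
    """
    return 1.0 / (2.0 * math.sqrt(density))

def max_sampling_interval(density, speed, fraction=0.5):
    """Longest time between acquisitions under the stated assumption.

    speed: average cell speed in distance units per minute.
    fraction: largest share of the nearest-neighbor distance a cell may
    travel between acquisitions (an assumed safety margin).
    Returns the interval in minutes.
    """
    return fraction * mean_nearest_neighbor_distance(density) / speed

density = 1e-4           # hypothetical: 1 cell per 10,000 square microns
spacing = mean_nearest_neighbor_distance(density)           # about 50 microns
interval = max_sampling_interval(density, speed=5.0)        # about 5 minutes
```

A correction for confluency or clustering, as noted above, would lower the effective spacing and therefore shorten the interval.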


Rolling Average of the Quality Score 

The quality score can be averaged over multiple time points and applied to a 

later time point as an average quality score. This "rolling average" will become part 

of the feature vector computed for each cell at each timepoint. This way, it is carried 

25 forward during the analysis of each image acquisition, without the need to access the 

entire image acquisition series. 

In a preferred embodiment, at time point t, this is defined as:

$$\text{Average quality score}_{t} = (1 - k)\cdot\text{Average quality score}_{t-1} + k\cdot\text{quality score}_{t}$$

where k is a constant that defines the weight factor of this geometric average.
The value of k can be determined experimentally by finding the best fit to sample
truth data in which cell identification is pre-determined. The choice of k
depends on the sampling frequency relative to the amount of change in the cells,
and on the desired amount of smoothing of the feature over time. A value of k
close to 1 will do little or no smoothing, while a value close to zero will do a
lot of smoothing. The average quality score value is initialized with a value
that reflects expectation at the beginning of the image acquisition series. The
method does not require control or truth data, but the parameters used to
calibrate the method for a specific biological sample are preferably derived
experimentally from control data that matches the experimental sample in the
measures used for cell identification. For example, control experiments can be
run, or a reasonable expectation for that value can be provided.
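
The rolling average can be sketched as a one-line update that needs only the
previous average and the newest score, which is why the full image series never
has to be revisited. The function name, the initial value of 1.0, and k = 0.5 are
illustrative assumptions, not values taken from the disclosure.

```python
def update_rolling_average(previous_average, new_value, k):
    """Return (1 - k) * previous_average + k * new_value for 0 <= k <= 1."""
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must lie between 0 and 1")
    return (1.0 - k) * previous_average + k * new_value


# Assumed starting expectation of 1.0 (a perfect match) and k = 0.5.
average = 1.0
history = []
for quality in [1.0, 0.5, 0.5]:
    average = update_rolling_average(average, quality, k=0.5)
    history.append(average)
print(history)
```

With k = 1 the average simply tracks the latest score; with k near 0 it barely
moves from its initial value.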

4. Reduction of the Confusion Matrix: The computational cost of the
Total Distance & Feature Matching Minimization method and a complete confusion
matrix can be quite high, and grows rapidly with the number of cells. Therefore,
in a preferred embodiment, the computationally less intensive Simple Proximity
method is used first, and only those cell identification matches that are
ambiguous are subjected to confusion matrix analysis, as necessary.

The strategy is as follows:

a) Try to match cells based on Simple Proximity

b) Identify problem areas where Simple Proximity may not work

c) Compute confusion matrices for those areas, on a limited set of cells

d) Solve confusion matrices for the problem areas

Examples 

Determining when the Simple Proximity method is insufficient 

As described previously, we can consider two examples. Then we proceed with
the strategy to assign the right method to the job.

1. Very simple case:

Three cells (a, b and c) each move a bit to the right and become A, B and C at
the next time point:

    a A
            b B
                    c C

// Program output...

Test 1... 3 cells move to the right

Previous Image:

    Label   Xcm   Ycm   CellID
    0       100   100   1
    1       200   200   2
    2       300   300   3

New Image:

    Label   Xcm   Ycm   CellID   Quality   New?   dX   dY   Distance
    0       120   100   1        100%      Old    20   0    20.00
    1       225   200   2        100%      Old    25   0    25.00
    2       330   300   3        100%      Old    30   0    30.00

Distance matrix:

              0        1        2
    0     20.00   128.06   269.07
    1    160.08    25.00   125.00
    2    304.80   164.01    30.00

2. The more difficult case:

Three cells, each of which moves a bit to the right. This sounds the same as #1,
but now the proximity of the new locations makes the situation confusing:

    a   Ab   B
     c   C

This situation is almost the same as the simple example, but it poses a major
tracking problem. For example, A and C are closer to b than to a or c. Using the
simple proximity method, the following results are obtained:

    a is lost, b moved to A or C, B is a "new cell" and c moved to A

This situation requires the more complex proximity matrix and competitive
matching portion of the algorithm to come up with a best global fit for all cells
involved. The algorithm can track this by using the Distance Minimization
function of the algorithm (see below).

// Program output...

Test 4... 3 cells move to the right, but are too close for simple matching

Previous Image:

    Label   Xcm   Ycm   CellID
    0       100   100   1
    1       125   100   2
    2       105   105   3

Simple proximity method:

New Image:

    Label   Xcm   Ycm   CellID   Quality   New?   dX   dY   Distance
    0       120   100   2        100%      Old    -5   0     5.00
    1       145   100   3        12%       Old    40   -5   40.00
    2       125   105   1        100%      Old    25   5    25.00

Simple proximity method failed

Now look at the distance matrix more closely:

Distance matrix:

              0        1        2
    0     20.00     5.00    15.81
    1     45.00    20.00    40.31
    2     25.50     5.00    20.00

    matching permutation    computed total distance weight
    0 1 2                   20.00 + 20.00 + 20.00 = 60.00
    0 2 1                   20.00 +  5.00 + 40.31 = 65.31
    1 0 2                   45.00 +  5.00 + 20.00 = 70.00
    1 2 0                   45.00 +  5.00 + 15.81 = 65.81
    2 1 0                   25.50 + 20.00 + 15.81 = 61.31
    2 0 1                   25.50 +  5.00 + 40.31 = 70.81

Hence 0 1 2 is the best matching sequence

Total Distance Minimization method:

New Image:

    Label   Xcm   Ycm   CellID   Quality   New?   dX   dY   Distance
    0       120   100   0        100%      Old    20   0    20.00
    1       145   100   1        12%       Old    20   0    20.00
    2       125   105   2        100%      Old    20   0    20.00
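
The distance matrix and the exhaustive permutation search of Test 4 can be
reproduced with a short sketch. It is not the program that generated the output
above: the naive nearest-neighbor function below lets every new cell pick its
closest previous cell with no competition, which is an assumed simplification of
the Simple Proximity method, and all function names are hypothetical. Rows of the
matrix are new-image cells and columns are previous-image cells.

```python
import itertools
import math


def distance_matrix(previous, new):
    """Rows are new-image cells, columns are previous-image cells."""
    return [[math.hypot(nx - px, ny - py) for (px, py) in previous]
            for (nx, ny) in new]


def simple_proximity(dist):
    """For each new cell, index of the nearest previous cell (no competition)."""
    return [min(range(len(row)), key=row.__getitem__) for row in dist]


def total_distance_minimization(dist):
    """Try every one-to-one assignment of a square matrix; keep the smallest sum."""
    n = len(dist)

    def total(perm):
        # perm[i] is the previous-image cell assigned to new-image cell i.
        return sum(dist[new_cell][prev_cell]
                   for new_cell, prev_cell in enumerate(perm))

    best = min(itertools.permutations(range(n)), key=total)
    return best, total(best)


previous = [(100, 100), (125, 100), (105, 105)]   # cells a, b, c
new = [(120, 100), (145, 100), (125, 105)]        # cells A, B, C
dist = distance_matrix(previous, new)
naive = simple_proximity(dist)
best, best_total = total_distance_minimization(dist)
print(naive, best, best_total)
```

In this sketch all three new cells choose previous cell 1 (cell b) as their
nearest neighbor, which shows the conflict, while the exhaustive search recovers
the assignment 0 1 2 with a total distance of 60.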






So how do we know whether we are dealing with a case like example 2 rather than
example 1?

If there are many close contenders in the area, one can simply assume that the
simple proximity method will run into its limitations.

Secondly, Create and Destroy events suggest the existence of an "aliasing"
effect. It may be difficult to distinguish Create and Arrival events if the
sampling frequency is too low to make a proper judgment. At a low sampling
frequency, a cell that suddenly appears near the boundary could have been a
creation, or could simply have moved in quickly as an arrival. The same applies
for Destroy and Departure events. Note that an Arrival may also occur if a cell
enters the FOV "from above", meaning that a cell floating higher than the depth
of field lands in the FOV. Most observed Create and Destroy events are caused by
artifacts, such as a focus or signal-to-noise problem. If the problem is
corrected at a subsequent time point, the same cell will show up as a Create
event.

For example:

    a   Ab   B

The simple conclusion would be:

    a is destroyed, b moves to A, and B is a new creation.

In addition to the Create and Destroy clues, one can use an Average
LastMoveDist value to define a "sphere of influence".

Last Distance Moved

To assess the amount of movement expected from an individual cell, the
distance moved from the previous time point can be recorded. Although past
performance is not a true indication of how much the cell may move now, it is
better than no indication at all.

In a preferred embodiment, at time point t, this is defined as:

$$\text{LastMoveDist}_{t} = \sqrt{\left(\text{posx}_{t-2} - \text{posx}_{t-1}\right)^{2} + \left(\text{posy}_{t-2} - \text{posy}_{t-1}\right)^{2}}$$

This value needs to be initialized with a value that reflects expectation at the
beginning of the image acquisition set. For example, control experiments can be
run, or a reasonable expectation for that value can be provided.

Rolling Average of Last Distance Moved

Since the motion of a cell can seem erratic, it is preferred to average a few
time points rather than using a single time point. Thus, a further preferred
embodiment comprises carrying forward a rolling average of distance moved to
each new time point.

In a preferred embodiment, at time point t, this is defined as:

$$\text{Average LastMoveDist}_{t} = (1 - k)\cdot\text{Average LastMoveDist}_{t-1} + k\cdot\text{LastMoveDist}_{t}$$

where k is a constant that defines the weight factor of this geometric average.
The choice of k depends on the sampling frequency relative to the amount of
change in the cells, and on the desired amount of smoothing of the feature over
time. A value of k close to 1 will do little or no smoothing, while a value close
to zero will do a lot of smoothing. The Average LastMoveDist value needs to be
initialized with a value that reflects expectation at the beginning of the image
acquisition set. For example, control experiments can be run, or a reasonable
expectation for that value can be provided.

In a preferred embodiment, identifying cells or groups of cells that require
analysis by the confusion matrix comprises:

1. Find the largest Average LastMoveDist of all cells in this field. This is a
good indication of the motility of these cells. This number can be multiplied by
a safety factor, for example 1.3, to allow for, e.g., 30% more motility than was
previously seen. The only cost of increasing this number is computation time.

2. For each cell, compute a sphere of influence using this inflated Average
LastMoveDist number. The purpose is to generate a sphere large enough to avoid
generating false Create and Destroy events. However, the sphere should be small
enough (i.e., preferably 10 cells or fewer) that the confusion matrix of all
cells inside the sphere of influence does not become too computationally
intensive.

3. Merge spheres by propagation if they overlap. For example, when two
spatially distinct clusters of cells share only one cell that is close enough to
be part of either cluster, those clusters need to be merged into one.

4. The spheres result in groups of cells that "may have something to do with
each other." They are not really spheres anymore by that time, just a list of
cell IDs. Any time that there is more than one contender in a sphere, it can be
assumed that the simple proximity method is inadequate, and the more complex
matching methods are utilized.
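
The four steps above can be sketched as a grouping of cells into connected
clusters. This is a minimal illustration under stated assumptions: two cells are
treated as sharing a sphere of influence when their centers lie within the
inflated radius of each other, merging by propagation is implemented with a
union-find structure, and the function name and sample coordinates are
hypothetical.

```python
import math


def group_by_sphere_of_influence(points, avg_last_move_dists, safety_factor=1.3):
    """Return lists of cell indices whose spheres of influence overlap."""
    # Step 1: largest Average LastMoveDist in the field, inflated by a safety factor.
    radius = safety_factor * max(avg_last_move_dists)
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Steps 2 and 3: link any two cells within the radius; chains of links merge.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            if math.hypot(dx, dy) <= radius:
                parent[find(i)] = find(j)

    # Step 4: the "spheres" are now just lists of cell indices.
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


points = [(0, 0), (10, 0), (20, 0), (100, 100)]
groups = group_by_sphere_of_influence(points, [5.0, 10.0, 8.0, 2.0])
needs_complex_matching = [g for g in groups if len(g) > 1]
print(groups, needs_complex_matching)
```

Cells 0 and 2 are farther apart than the radius, yet they end up in one group
because cell 1 is close to both, which is the propagation case described in
step 3.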

Confusion Matrix

If the previous step identifies the need for complex matching, a confusion
matrix can be computed. In one embodiment, the confusion matrix is computed for
small groups of cells, preferably fewer than twenty cells, and even more
preferably fifteen cells or fewer.

For example, if there are three cells in the group, a matrix such as the one
below is created:

$$\begin{pmatrix} MM_{1,1} & MM_{1,2} & MM_{1,3} \\ MM_{2,1} & MM_{2,2} & MM_{2,3} \\ MM_{3,1} & MM_{3,2} & MM_{3,3} \end{pmatrix}$$

where $MM_{1,2}$ is the MisMatch of Cell 1 compared with Cell 2, etc. The
computed distance between the cells can be added to the MM matrix elements at
this point and used in the same computation as another matching feature.

Real Arrival/Departure and Create/Destroy Events

Using the above matrix will always generate a match, even if there are
Arrival/Departure and Create/Destroy events.

It is preferred that there be a limit at which a match is rejected, and at that
point a Create and/or a Destroy event is present. The average quality score can
be used for this purpose. This figure can be multiplied by an allowance factor to
come up with a threshold value. The "allowance factor" is preferably arrived at
by balancing the likelihood of a Create/Destroy event with the performance of
the tracking precision. The threshold can also be set externally, if enough
learning data sets of specific cell types and assays have been generated by
which to establish an appropriate threshold.
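
A minimal sketch of this rejection rule follows. The quality score formula is
the one recited in claim 5 (the difference between the second closest and the
closest distance, divided by the second closest distance); the allowance factor
of 0.5, the function names, and the decision to reject when the score falls
below the threshold are illustrative assumptions.

```python
def quality_score(closest_distance, second_closest_distance):
    """(d2 - d1) / d2, where d1 is the closest and d2 the second closest distance."""
    if second_closest_distance <= 0:
        raise ValueError("second closest distance must be positive")
    return (second_closest_distance - closest_distance) / second_closest_distance


def is_match_accepted(quality, average_quality, allowance_factor):
    """Accept a match only if its quality reaches allowance_factor * average_quality."""
    threshold = allowance_factor * average_quality
    return quality >= threshold


score = quality_score(5.0, 20.0)
accepted = is_match_accepted(score, average_quality=0.9, allowance_factor=0.5)
rejected = is_match_accepted(0.1, average_quality=0.9, allowance_factor=0.5)
print(score, accepted, rejected)
```

A rejected match would then be handled as a Create and/or Destroy event rather
than forced into the assignment.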

Reduction of the Confusion Matrix

The confusion matrix can become very hard to solve if the number of cells in a
confusion cluster is larger than about 10-20 cells. The number of permutations
that need to be evaluated is proportional to the factorial of the number of cells
in the cluster. This can be avoided by setting the maximum reasonable distance
between cells low enough, and using a sampling frequency that is appropriate,
based on previous test data and setting new standards for the preparation and
assay parameters. Use of this "matrix reduction method" allows handling of
larger confusion matrices of, for example, 20-40 cells, at a fraction of the
computational time.
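
The factorial growth mentioned above is easy to make concrete. The short sketch
below only counts the one-to-one assignments an exhaustive search would have to
evaluate for a cluster of n cells; the function name is hypothetical.

```python
import math


def permutation_count(cells_in_cluster):
    """Number of one-to-one assignments for a cluster of the given size."""
    return math.factorial(cells_in_cluster)


for n in (3, 10, 15, 20):
    print(n, permutation_count(n))
```

Going from 10 to 20 cells multiplies the work by a factor of roughly 670 billion,
which is why keeping clusters small matters more than any constant-factor
speed-up.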

Alternatively, the efficiency of solving the confusion matrix can be increased
by using the distance matrix to "pre-screen" the confusion matrix elements. This
method involves excluding from the confusion matrix any cell identification
matches with a quality score at or above a user-defined threshold for quality
scores (as determined by the distance matrix).

The cell tracking methods disclosed herein provide information on the
continuity of cell identification from time point to time point in a kinetic
cell screening assay. To integrate the information with a cell screening
assay(s), the results from the cell tracking methods are preferably managed so
that cell and well features, and kinetic output features, can be associated with
the correct cells. Relating assay output features to cell identification
requires additional data management. Optimal computation of kinetic features
(cell-based or well-based) depends on a cell data management algorithm (Figure
1) that works in conjunction with the cell tracking module. The data management
serves three important purposes: (1) to dynamically relate the list of output
features and cell IDs to each other; (2) to enable modification of the assay
output data by the results of cell tracking; (3) to enable the correct sorting
of data sets obtained from multiple images. For example, assay data may be
eliminated for invalid cells. Cells may be marked as invalid if, for example,
they are present at some time points but not other time points.

At each time point the cell data need to be rearranged in accordance with the
current cell ID (kinetics cell ID) so that cell kinetic data can be computed.
Then, the kinetic data need to be realigned with cell IDs again for well
statistics to be computed. Statistics can be computed on all cells in a well or
only on fully tracked cells, depending on the needs of the user. The data
management algorithm keeps track of all newly identified cells (at any current
time point), thus allowing the user to identify the time interval (starting time
point and ending time point) during which each cell has been tracked. This, in
turn, makes it possible to flag cells that were fully tracked from the beginning
of the experiment to the end of it. The ability to select cells that fit certain
cell ID criteria is valuable for producing optimal kinetic data on the cell
level. While population averaged data may be minimally affected by the loss or
gain of a few cells, the cell level kinetic data can be dramatically affected by
mis-identification or by cells that are not detectable throughout the entire
experiment.

In another aspect, the present invention comprises a computer readable storage
medium comprising a program containing a set of instructions for causing a cell
screening system to execute procedures for tracking individual cells during a
kinetic cell screening assay, wherein the procedures comprise the various method
steps of the invention. The computer readable medium includes but is not limited
to magnetic disks, optical disks, organic memory, and any other volatile (e.g.,
Random Access Memory ("RAM")) or non-volatile (e.g., Read-Only Memory ("ROM"))
mass storage system readable by the CPU. The computer readable medium includes
cooperating or interconnected computer readable media, which may exist
exclusively on the processing system or be distributed among multiple
interconnected processing systems that may be local or remote to the processing
system.
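
The bookkeeping performed by the cell data management algorithm described above
(a tracking interval for each cell and a flag for fully tracked cells) can be
sketched as follows. The data layout, one set of kinetic cell IDs per time
point, and the function names are assumptions made for illustration only.

```python
def tracking_intervals(ids_per_timepoint):
    """Map each cell ID to (first, last) time point index at which it was seen."""
    intervals = {}
    for t, ids in enumerate(ids_per_timepoint):
        for cell_id in ids:
            start, _ = intervals.get(cell_id, (t, t))
            intervals[cell_id] = (start, t)
    return intervals


def fully_tracked_cells(ids_per_timepoint):
    """Cell IDs present at every time point; all other cells may be marked invalid."""
    if not ids_per_timepoint:
        return set()
    return set.intersection(*(set(ids) for ids in ids_per_timepoint))


ids_per_timepoint = [{1, 2, 3}, {1, 2}, {1, 2, 4}]
intervals = tracking_intervals(ids_per_timepoint)
valid = fully_tracked_cells(ids_per_timepoint)
print(intervals, valid)
```

Well statistics could then be computed either over every key of `intervals` or
only over `valid`, matching the two options given to the user.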

In a preferred embodiment, the cell screening system comprises a fluorescence
optical system with a stage adapted for holding cells and a means for moving the
stage, a digital camera, a light source, and a computer means for receiving and
processing the digital data from the digital camera. This aspect of the
invention comprises programs that instruct the cell screening system to define
the organization of the cellular component(s) of interest in individual cells,
using the methods disclosed herein.

The methods of the invention can be used in conjunction with any cell-based
screening assay, including multiparametric assays, that can benefit from kinetic
analysis. A series of biologically important metabolites, regulatory molecules,
and organelles (such as those shown in Table 1) can be labeled with fluorophores,
and their activity or concentrations determined by measuring intensity changes
over time. A majority of these indicators require intact, living cells, which
inherently change over time. Therefore, single cell kinetic intensity
measurements are required for high content information from these indicators.
Most of the small molecule indicators listed in Table 1 (including trademarked
indicators) are available from Molecular Probes.



Table 1. Intensity Based Indicators of Biomolecular Activity

Target                                  Fluorescent Indicator
--------------------------------------  ------------------------------------------
Ca2+                                    Fluo-4, FLIPR, Indo-1, Fura-2
Mg2+                                    Mg-Fura-2
Na+                                     SBFI
K+                                      PBFI
Cl-                                     SPQ
Metal ions: Zn2+, Cu+, Cu2+, Cd2+,      Calcein, Calcium Green-1, BTC-5N,
  Hg2+, Ni+, Co2+, Pb2+, Fe2+, Fe3+,      FITC-Gly-His, TCCP, TSPP, APTRA-BTC
  Ba2+, As3+, Tb3+, La2+
pH                                      BCECF, SNARF, SNAFL, NERF
Gene Expression                         GFP-cDNA chimera with gene of choice
Proliferation and DNA content           Hoechst, DAPI
Viability                               Live/Dead dyes such as CMFDA or Calcein
                                          (live) / Propidium Iodide (dead)
Membrane Potential                      DiBAC
Cellular organelles                     MITOTRACKER™, JC-1, LYSOTRACKER™,
                                          Fluorescein-Dextran, Carbocyanin and
                                          ceramide dyes
Nitric Oxide/Reactive Oxygen Species    Chloro-Fluorescein
Phosphoinositides                       Bodipy-Inositol
Cyclic AMP                              PKA Chimeras and covalently labeled proteins
Multi Drug Resistance transporter       Doxorubicin, Rhodamine-123
Protease activity                       Amino-coumarin substrate peptides
Cell Surface and Intracellular          Various Fluorescent Ligands
  Receptors



Ligand Binding 

Cell surface receptors bind specific extracellular ligands. Some native
ligands induce molecular function, while other exogenous molecules such as drugs
bind, partition in subcellular compartments, and modulate biomolecule function.
Ligands that are fluorescently labeled can be monitored for binding to the cell.
Fluorescent EGF binding to Epidermal Growth Factor Receptor occurs within a few
minutes, activating the receptor. After surface binding, the EGF-receptor
complex internalizes into endosomal compartments, indicating down-regulation and
termination of the signal. Binding and internalization can be detected using the
kinetic methods of the invention.

Cell Viability 

Intact plasma membranes can be detected by introducing indicators that pass
through intact cell membranes and are trapped intracellularly by enzymatic
removal of side groups needed for membrane permeability. Dyes remain trapped,
labeling cells, unless the plasma membrane is ruptured, releasing the
internalized dyes. Acetoxymethyl ester derivatives of calcein work well as
indicators of intact cell membranes and viable cells. Ongoing viability of the
cells can be monitored in conjunction with the kinetic methods of the invention.

GFP Expression 

The kinetic methods of the invention can be used to monitor expression of
proteins over time. Many proteins can be fluorescently labeled without
perturbing function by making DNA constructs of the protein of interest that
contain additional code for a Green Fluorescent Protein (GFP). These bioreporters
are expressed in cells to produce functional protein that is fluorescently
labeled. These probes would be useful as i) a target validation tool with which
the levels of potential therapeutic targets expressed in genetically engineered
cells could be monitored, or ii) a screening tool with which the effects of
compounds on levels of GFP-[Promoter of Interest] fusion proteins could be
monitored. The time of response is on the order of hours to days.

Nitric Oxide/Reactive Oxygen Species 

Nitric Oxide is an important signaling molecule in neurons and endothelial
cells, and controls vascular tone and cell communication. This application could
be used as a screening tool, or as a cytotoxicity tool to monitor production of
reactive oxygen species. These molecules are important pharmacological targets
for stroke, Alzheimer's disease, Parkinson's disease and congestive heart
failure. The time of response is on the order of 1-10 minutes, and thus could be
developed using the kinetic methods of the invention.

Multiple Drug Resistance (MDR) 

This application can be used to monitor the activity of the cell surface
transporter, P-glycoprotein. This is a molecular pump that is embedded in the
plasma membrane and pumps anticancer drugs out of cells, rendering the cells
resistant to a wide variety of therapeutic agents. The time response for this
assay is on the order of minutes, and thus could be developed using the kinetic
methods of the invention. This assay would be applicable to anticancer
therapies.

Lysosome pH 

Fluorescein labeled dextrans are taken up into the cell by endocytosis and end
up in lysosomal compartments where the dextrans are degraded. The intensity of
fluorescein is pH dependent, and so measuring intensity over time is sufficient
to detect changes in lysosomal activity induced by drugs such as the proton
ionophore, monensin.

In a preferred embodiment of the use of the kinetic methods of the invention in
conjunction with a cell screening assay, cells are segmented by contacting the
cells with a nuclear label and using information from the nuclear channel.
Images from the nuclear channel can optionally be preprocessed (shade corrected
and smoothed) and are thresholded (using an automatic thresholding procedure),
producing a nuclear mask. By drawing lines equidistant to nuclei edges
(watershed approach), the nuclear zones of influence (non-touching cellular
domain masks) are identified and the mask of the domains is created. For each
nuclear mask, an extended nuclear mask is created (the nucleus mask is dilated a
number of times that is dependent upon the cell type and size). The logical
"AND" of the extended mask with the corresponding cellular domain results in a
final mask that is then applied to the second channel to measure the
fluorescence intensity of the relevant fluorescent marker under the mask. Nuclei
are masked and cells are segmented by defining domains outside of each nucleus
with a watershed routine. Kinetic features are then determined, based on the
changes in intensity in individual cells from one measurement to the next, as
described above.
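
The masking steps described above can be illustrated on a toy image with a
deliberately simple, pure-Python sketch. It is not the disclosed implementation:
the zones of influence are approximated by assigning each pixel to its nearest
nucleus pixel instead of running a true watershed, each dilation is assumed to
use a 3x3 neighborhood (so n dilations reach pixels within a Chebyshev distance
of n), and all names and pixel values are hypothetical.

```python
def zones_of_influence(labels):
    """Assign every pixel the label of the nearest labeled nucleus pixel."""
    h, w = len(labels), len(labels[0])
    seeds = [(r, c, labels[r][c])
             for r in range(h) for c in range(w) if labels[r][c]]
    return [[min(seeds, key=lambda s: (s[0] - r) ** 2 + (s[1] - c) ** 2)[2]
             for c in range(w)] for r in range(h)]


def dilated_mask(labels, label, iterations):
    """Pixels within `iterations` 3x3 dilations of the nucleus with this label."""
    h, w = len(labels), len(labels[0])
    pixels = [(r, c) for r in range(h) for c in range(w) if labels[r][c] == label]
    return {(r, c) for r in range(h) for c in range(w)
            if any(max(abs(r - pr), abs(c - pc)) <= iterations
                   for pr, pc in pixels)}


def cell_mask(labels, zones, label, iterations):
    """Logical AND of the extended nuclear mask with the cell's own domain."""
    return {(r, c) for (r, c) in dilated_mask(labels, label, iterations)
            if zones[r][c] == label}


def mean_intensity(image, mask):
    """Average second-channel intensity under the final mask."""
    return sum(image[r][c] for r, c in mask) / len(mask)


nuclei = [[1, 0, 0, 0, 0, 2]]            # labeled nuclear mask, two nuclei
marker = [[10, 20, 30, 40, 50, 60]]      # second-channel intensities
zones = zones_of_influence(nuclei)
mask_1 = cell_mask(nuclei, zones, label=1, iterations=3)
print(zones, sorted(mask_1), mean_intensity(marker, mask_1))
```

The dilation of nucleus 1 reaches the fourth pixel, but the AND with its zone of
influence clips the mask before it can spill into the neighboring cell.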

By determining the intensity of the fluorescence emitted by the markers in
individual cells at various time points, the method provides cell-based, kinetic
measurements of one or more of the following:

Dynamic changes in intensity over time
Heterogeneity of intensity among cells
Repetitive oscillations in intensity
Waves of intensity changes through connected cells
Subpopulations of responding cells
Sequential activation of signaling molecules

In a preferred embodiment, the method provides a quantal response of cells
(i.e., percent of responding cells with an intensity above a threshold value),
which increases the value of the present assay over those assays that measure
only the raw amplitude of response. The threshold to be used for a particular
parameter can be determined for each time point, and the value(s) of the
thresholds can be set before the scan as an assay input parameter, or can be
reset during data analysis.
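
The quantal response reduces to counting cells above a threshold. A minimal
sketch follows, with a hypothetical function name and made-up intensities; it
assumes one intensity value per tracked cell at the time point of interest.

```python
def percent_responders(intensities, threshold):
    """Percentage of cells whose intensity exceeds the threshold."""
    if not intensities:
        raise ValueError("at least one cell is required")
    responders = sum(1 for value in intensities if value > threshold)
    return 100.0 * responders / len(intensities)


cell_intensities = [1.0, 2.5, 3.0, 0.5]
response = percent_responders(cell_intensities, threshold=2.0)
print(response)
```

Because the threshold is an input, the same stored intensities can be
re-evaluated with a different threshold during data analysis.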

In a further preferred embodiment of the invention, the kinetic measurement is
modified, sorted, and/or excluded depending on the quality score for the cell
identification match for each cell. Sorting includes pooling data for all cells
of some group, such as fast cells, cells on the 5th image set, cells with red
markers, and subpopulations of large cells.

In the case of calcium assays, kinetic features that can be determined include,
but are not limited to:

Cell-based Kinetic Features:

• Intensity - Cell averaged fluorescent intensity averaged over time

• Prestim Intensity - The baseline intensity value prior to stimulation by
agonist (averaged value over all prestim points)

• Peak Intensity - Peak intensity value (highest point or curve fit to find
inflection point)

• Relative Peak Intensity Value - Peak Intensity / Prestim Intensity

• Time to Peak Intensity

• Plateau Intensity

• Relative Plateau Intensity

• Integrated Intensity of Ca2+ signaling

• Oscillation frequency

• Oscillation persistence

• Oscillation amplitude

Well-based Kinetic Features:

• Avg Fluorescent Intensity

• Avg Baseline Intensity

• Avg Peak Intensity

• Avg Relative Peak Intensity

• Avg Time to Peak Intensity

• Avg Plateau Intensity to plateau and asymptote

• Avg Relative Plateau Intensity

• Avg Integrated Intensity of Ca2+ Signaling

• Avg Oscillation Frequency

• Avg Oscillation Persistence

• Avg Oscillation Amplitude
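
Several of the cell-based features listed above can be computed from a single
tracked cell's intensity trace, as in the sketch below. The choices made here are
assumptions rather than definitions from the disclosure: the peak is taken as
the highest sample (no curve fit), time to peak is measured from the stimulation
time point, and the integrated intensity is a trapezoidal sum of the signal above
the prestimulation baseline over the whole trace.

```python
def kinetic_features(times, intensities, stim_index):
    """Compute a few kinetic features for one cell's intensity trace."""
    if stim_index < 1 or stim_index >= len(intensities):
        raise ValueError("stim_index must leave at least one prestim point")
    prestim = sum(intensities[:stim_index]) / stim_index
    peak_index = max(range(len(intensities)), key=intensities.__getitem__)
    peak = intensities[peak_index]
    # Trapezoidal integration of (intensity - baseline) over time.
    integrated = sum(
        0.5 * ((intensities[i] - prestim) + (intensities[i + 1] - prestim))
        * (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    )
    return {
        "prestim_intensity": prestim,
        "peak_intensity": peak,
        "relative_peak_intensity": peak / prestim,
        "time_to_peak": times[peak_index] - times[stim_index],
        "integrated_intensity": integrated,
    }


features = kinetic_features(
    times=[0.0, 1.0, 2.0, 3.0, 4.0],
    intensities=[10.0, 10.0, 20.0, 40.0, 20.0],
    stim_index=2,
)
print(features)
```

The well-based features would then be averages of these per-cell values over
the cells selected for the well, for example only the fully tracked ones.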

We claim:

1. The present invention provides methods and software for tracking individual
cells during a kinetic cell screening assay, comprising:

a) providing cells that possess at least a first luminescently labeled
reporter molecule that reports on a cell structure;

b) obtaining a structure image from luminescent signals from the at least
first luminescently labeled reporter molecule in the cells in a field of view;

c) creating a structure mask for individual cells in the field of view;

d) defining a reference point of each structure mask;

e) assigning a cell identification to each reference point in the field of
view;

f) repeating steps (b) through (e) at a second time point;

g) correlating cell identification between the first time point and the
second time point by calculating a distance between reference points in the
field of view at the first time point and reference points in the field of view
at the second time point; and

h) defining a cell identification match by identifying reference points in
the field of view at the first time point and reference points in the field of
view at the second time point that are closest together.

2. The method of claim 1, further comprising repeating steps (f) - (h) a desired
number of times, wherein determining the distance between reference points is
done by determining a distance between reference points in successive time
points, and wherein defining the closest cell identification match is done by
defining the closest cell identification match in successive time points.

3. The method of claim 1 wherein a cell identification match is rejected if the 
cells identified as a cell identification match are farther apart than a user-defined limit. 

4. The method of claim 3 further comprising assigning a quality score to the
cell identification match based on a distance determined for a second closest
cell identification match, and wherein a cell identification match is rejected
if the quality score is below a user-defined threshold for a quality score.

5. The method of claim 4 wherein the quality score is calculated by a method
comprising dividing the difference between the distance between reference points
in the closest cell identification match from the distance between reference
points in the second closest cell identification match by the distance between
reference points in the second closest cell identification match.

6. The method of claim 1 further comprising determining a total sum of all
distances or distances squared for all possible cell identification matches in
successive time points, wherein a smallest total sum of all distances or
distances squared is defined as a closest set of cell identification matches.

7. The method of claim 6 further comprising assigning a quality score to the
cell identification match based on a sum of distances or distances squared
determined for a second closest cell identification match, and wherein a cell
identification match is rejected if the quality score is below a user-defined
threshold for a quality score.

8. The method of claim 7 further comprising excluding from the determining a
total sum of all distances or distances squared for all possible cell
identification matches in successive time points any cell identification matches
with a quality score at or above a user-defined threshold for quality scores.

9. The method of claim 8 wherein defining the cell identification match further
comprises comparing other features of the individual cells between successive
time points.

10. The method of claim 9 wherein the features comprise one or more features 
selected from the group consisting of area, shape, size, and luminescent intensity of 
cell or subcellular structures, and exogenous tags associated with cell or subcellular 
structures. 

11. The method of claim 10 wherein the feature comprises an exogenous tag, and
wherein the exogenous tag comprises a bar coding tag.

12. The method of claim 10 further comprising excluding any cell identification
matches with a quality score at or above a user-defined threshold for quality
scores from the comparing other characteristic features of the individual cells
between successive time points.

13. The method of claim 10 further comprising excluding any cell identification
matches with a quality score below a user-defined threshold for quality scores
from the comparing other characteristic features of the individual cells between
successive time points.

14. The method of claim 4 wherein the quality score is averaged over multiple
time points and applied to a later time point as an average quality score.

15. The method of claim 7 further comprising determining a distance moved by
the individual cell in successive time points.

16. The method of claim 15 further comprising determining an average distance
moved by an individual cell over multiple time points.

17. The method of claim 1 wherein the cell structure is a nucleus, wherein the
structure image is a nuclear image, and wherein the structure mask is a nuclear
mask.

18. The method of claim 1, wherein the cell screening assay comprises one or 
more assays for the kinetic analysis of a cell parameter selected from the group 
consisting of ionic concentration, pH, gene expression, DNA proliferation, DNA 
content, cell viability, membrane potential, production of reactive oxygen species, 

25 enzyme activity, receptor activation, ligand binding, and transporter activity. 

19. The method of claim 18, wherein the cells further possess at least a second 
luminescently labeled reporter molecule that reports on the cell parameter, and 
wherein the method further comprises obtaining luminescent signals from the second 
luminescently labeled reporter molecule and calculating a kinetic measure of the
luminescent signals from the second luminescently labeled reporter molecule in
individual cells, wherein the kinetic measure is selected from the group consisting of
dynamic changes in intensity over time, heterogeneity of intensity among cells,
oscillations in intensity, waves of intensity changes through connected cells,
subpopulations of responding cells, and sequential activation of signaling molecules.

20. The method of claim 19, wherein the kinetic measure is modified, sorted,
and/or excluded depending on a quality score for the cell identification match for each
cell.

21. A computer readable storage medium comprising a program containing a set 
of instructions for causing a cell screening system to execute procedures for tracking 
individual cells during a kinetic cell screening assay, wherein the procedures comprise 

a) providing cells that possess at least a first luminescently labeled 
reporter molecule that reports on a cell structure;

b) obtaining a structure image from luminescent signals from the at least 
first luminescently labeled reporter molecule in the cells in a field of view; 

c) creating a structure mask for individual cells in the field of view; 

d) defining a reference point of each structure mask; 

e) assigning a cell identification to each reference point in the field of
view;

f) repeating steps (b) through (e) at a second time point; 

g) correlating cell identification between the first time point and the 
second time point by calculating a distance between reference points in the field of
view at the first time point and reference points in the field of view at the second time
point; and

h) defining a cell identification match by identifying reference points in 
the field of view at the first time point and reference points in the field of view at the 
second time point that are closest together. 
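
Steps (g) and (h) of claim 21 can be sketched as a nearest-reference-point search. The following Python example is a minimal illustration, assuming reference points are (x, y) coordinates keyed by cell identification and that "closest together" means smallest Euclidean distance; the function name `match_cells` and the dictionary layout are hypothetical and are not taken from the specification.

```python
import math


def match_cells(prev_points, curr_points):
    """Match cell identifications between two time points.

    prev_points / curr_points: dicts mapping a cell identification to the
    (x, y) reference point of its structure mask (hypothetical layout).
    Returns {curr_id: (prev_id, distance)}, pairing each current reference
    point with the closest reference point from the previous time point.
    """
    matches = {}
    for curr_id, q in curr_points.items():
        best_id, best_d = None, math.inf
        for prev_id, p in prev_points.items():
            d = math.dist(p, q)  # step (g): distance between reference points
            if d < best_d:
                best_id, best_d = prev_id, d
        if best_id is not None:
            matches[curr_id] = (best_id, best_d)  # step (h): closest pair
    return matches


t1 = {1: (10.0, 10.0), 2: (50.0, 50.0)}
t2 = {"a": (52.0, 50.0), "b": (10.0, 13.0)}
print(match_cells(t1, t2))  # {'a': (2, 2.0), 'b': (1, 3.0)}
```

This simple search can assign two current cells to the same earlier cell, for example after a division or when cells crowd together; a quality score on each match, as recited in the dependent claims, is one way to flag such cases.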

[Figure 1 (sheet 1/1): flowchart. Box labels as printed: Begin; For each timepoint; Run fixed endpoint assay; For each cell; Reconcile kinetic IDs; Get cell start timepoint index; Get cell end timepoint index; Set cell tracking status; Calculate kinetic cell features; Calculate well kinetic features; End. Loop decisions are labeled Yes and No.]

(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(19) World Intellectual Property Organization 
International Bureau 

(43) International Publication Date 
8 August 2002 (08.08.2002) 




(10) International Publication Number
WO 02/061423 A3

(51) International Patent Classification 7: G01N 33/50,
G06F 19/00, G01N 15/14, 21/64

(21) International Application Number: PCT/US01/49928

(22) International Filing Date: 21 December 2001 (21.12.2001)

(25) Filing Language: English

(26) Publication Language: English

(30) Priority Data:
60/258,147 22 December 2000 (22.12.2000) US

(71) Applicant (for all designated States except US): CELLOMICS,
INC. [US/US]; 635 William Pitt Way, Pittsburgh, PA 15238 (US).

(72) Inventors; and
(75) Inventors/Applicants (for US only): SAMMAK, Paul
[US/US]; 551 Olive Street, Pittsburgh, PA 15237 (US).
ROSANIA, Gustavo [CO/US]; 1805 Vinan Kay Circle,
Ann Arbor, MI 48103 (US). RUBIN, Richard
[US/US]; 216 Gladstone Road, Pittsburgh, PA 15238
(US). NEDERLOF, Michel [BE/US]; 1502 Fox Chapel
Road, Pittsburgh, PA 15238 (US). LAPETS, Oleg, P.
[RU/US]; Shady Oak Circle, Allison Park, PA 15101 (US).
SHOPOFF, Randall, O. [US/US]; 113 Country Club
Drive, Pittsburgh, PA 15235 (US). KANNAN, Murugan
[IN/US]; 8988 Meadow Oaks Drive, Allison Park, PA
15101 (US).

(74) Agent: HARPER, David, S.; McDonnell Boehnen Hulbert
& Berghoff, Suite 3200, 300 South Wacker Drive,
Chicago, IL 60606 (US).




(54) Title: IDENTIFICATION OF INDIVIDUAL CELLS DURING KINETIC ASSAYS 




(57) Abstract: The present invention provides methods 
and software for tracking individual cells during a kinetic 
cell screening assay. 




(81) Designated States (national): AE, AG, AL, AM, AT, AU,
AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CO, CR, CU,
CZ, DE, DK, DM, DZ, EC, EE, ES, FI, GB, GD, GE, GH,
GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC,
LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW,
MX, MZ, NO, NZ, OM, PH, PL, PT, RO, RU, SD, SE, SG,
SI, SK, SL, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ,
VN, YU, ZA, ZM, ZW.

(84) Designated States (regional): ARIPO patent (GH, GM,
KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZM, ZW),
Eurasian patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM),
European patent (AT, BE, CH, CY, DE, DK, ES, FI, FR,
GB, GR, IE, IT, LU, MC, NL, PT, SE, TR), OAPI patent
(BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, ML, MR,
NE, SN, TD, TG).

Published: 

— with international search report 

(88) Date of publication of the international search report: 

20 February 2003 

For two-letter codes and other abbreviations, refer to the "Guidance
Notes on Codes and Abbreviations" appearing at the beginning
of each regular issue of the PCT Gazette.



INTERNATIONAL SEARCH REPORT 



International Application No

PCT/US 01/49928 



A. CLASSIFICATION OF SUBJECT MATTER 

IPC 7 G01N33/50 G06F19/00 G01N15/14 G01N21/64 



According to International Patent Classification (IPC) or to both national classification and IPC



B. FIELDS SEARCHED 



Minimum documentation searched (classification system followed by classification symbols) 

IPC 7 G01N 



Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched 



Electronic data base consulted during the international search (name of data base and, where practical, search terms used)

EPO-Internal, WPI Data, PAJ, INSPEC, COMPENDEX, BIOSIS, MEDLINE 



C. DOCUMENTS CONSIDERED TO BE RELEVANT 



Citation of document, with indication, where appropriate, of the relevant passages; relevant to claim No.

WO 00 72258 A (CELLOMICS, INC.)
30 November 2000 (2000-11-30)
the whole document
Relevant to claim No.: 1-21

US 5 973 732 A (GUTHRIE, T.C.)
26 October 1999 (1999-10-26)
the whole document
Relevant to claim No.: 1-21

GIULIANO K A ET AL: "HIGH-CONTENT SCREENING: A NEW APPROACH TO EASING KEY
BOTTLENECKS IN THE DRUG DISCOVERY PROCESS"
JOURNAL OF BIOMOLECULAR SCREENING, LARCHMONT, NY, US,
vol. 4, no. 2, 1997, pages 249-259, XP002952092
ISSN: 1087-0571
the whole document
Relevant to claim No.: 1-21



Further documents are listed in the continuation of box C. 



Patent family members are listed in annex.



Special categories of cited documents:

"A" document defining the general state of the art which is not considered to be of particular relevance
"E" earlier document but published on or after the international filing date
"L" document which may throw doubts on priority claim(s) or which is cited to establish the publication date of another citation or other special reason (as specified)
"O" document referring to an oral disclosure, use, exhibition or other means
"P" document published prior to the international filing date but later than the priority date claimed
"T" later document published after the international filing date or priority date and not in conflict with the application but cited to understand the principle or theory underlying the invention
"X" document of particular relevance; the claimed invention cannot be considered novel or cannot be considered to involve an inventive step when the document is taken alone
"Y" document of particular relevance; the claimed invention cannot be considered to involve an inventive step when the document is combined with one or more other such documents, such combination being obvious to a person skilled in the art
"&" document member of the same patent family



Date of the actual completion of the international search: 25 October 2002

Date of mailing of the international search report: 11/11/2002



Name and mailing address of the ISA 

European Patent Office, P.B. 5818 Patentlaan 2
NL-2280 HV Rijswijk
Tel. (+31-70) 340-2040, Tx. 31 651 epo nl,
Fax (+31-70) 340-3016



Authorized officer 



Thumb, W 



Form PCT/ISA/210 (second sheet) (July 1992)

page 1 of 2



C. (Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT

MCNALLY J G: "Computational optical-sectioning microscopy for 3-D
quantification of cell motion: Results and challenges"
IMAGE RECONSTRUCTION AND RESTORATION, SAN DIEGO, CA, USA, 25-26 JULY 1994,
vol. 2302, pages 342-351, XP001118842
Proceedings of the SPIE - The International Society for Optical Engineering, 1994, USA
ISSN: 0277-786X
page 347, paragraph 5 - page 348, paragraph 3
Relevant to claim No.: 1-21

Form PCT/ISA/210 (continuation of second sheet) (July 1992)

page 2 of 2



INTERNATIONAL SEARCH REPORT
Information on patent family members

International Application No: PCT/US 01/49928

Patent document          Publication    Patent family     Publication
cited in search report   date           member(s)         date

WO 0072258               30-11-2000     AU 5284800 A      12-12-2000
                                        EP 1138019 A2     04-10-2001
                                        WO 0072258 A2     30-11-2000

US 5973732               26-10-1999     CA 2229916 A1     19-08-1998

Form PCT/ISA/210 (patent family annex) (July 1992)




